14 research outputs found

    Learning Speech Emotion Representations in the Quaternion Domain

    The modeling of human emotion expression in speech signals is an important, yet challenging task. The high resource demand of speech emotion recognition models, combined with the general scarcity of emotion-labelled data, are obstacles to the development and application of effective solutions in this field. In this paper, we present an approach to jointly circumvent these difficulties. Our method, named RH-emo, is a novel semi-supervised architecture aimed at extracting quaternion embeddings from real-valued monaural spectrograms, enabling the use of quaternion-valued networks for speech emotion recognition tasks. RH-emo is a hybrid real/quaternion autoencoder network that consists of a real-valued encoder in parallel to a real-valued emotion classifier and a quaternion-valued decoder. On the one hand, the classifier makes it possible to optimize each latent axis of the embeddings for the classification of a specific emotion-related characteristic: valence, arousal, dominance and overall emotion. On the other hand, the quaternion reconstruction enables the latent dimension to develop the intra-channel correlations that are required for an effective representation as a quaternion entity. We test our approach on speech emotion recognition tasks using four popular datasets (Iemocap, Ravdess, EmoDb and Tess), comparing the performance of three well-established real-valued CNN architectures (AlexNet, ResNet-50, VGG) with their quaternion-valued equivalents fed with the embeddings created with RH-emo. We obtain a consistent improvement in test accuracy on all datasets, while drastically reducing the resource demand of the models. Moreover, we performed additional experiments and ablation studies that confirm the effectiveness of our approach. The RH-emo repository is available at: https://github.com/ispamm/rhemo. Comment: Paper submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing
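As background for the abstract above, the following is a minimal sketch of the Hamilton product, the multiplication rule that quaternion-valued layers generally build on. It is not taken from the RH-emo repository: the function name `hamilton_product`, the `(r, i, j, k)` tuple layout, and the idea of reading a four-axis embedding as one quaternion per position are illustrative assumptions.

```python
from typing import Tuple

# A quaternion stored as (r, i, j, k): one real and three imaginary components.
# Hypothetical layout, chosen only for this illustration.
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Return the Hamilton product p * q (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


if __name__ == "__main__":
    i = (0, 1, 0, 0)
    j = (0, 0, 1, 0)
    print(hamilton_product(i, j))  # i * j
    print(hamilton_product(j, i))  # j * i, differs in sign from i * j
```

Because every output component mixes all four input components, the four channels are processed as one entity rather than independently, which is the property the abstract refers to when it speaks of a representation as a quaternion entity.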

    L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing

    The L3DAS21 Challenge is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Usually, machine learning approaches to 3D audio tasks are based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based on multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, this is the first time that a dual-mic Ambisonics configuration has been used for these tasks. We provide baseline models and results for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and SELDNet for SELD. This report is aimed at providing all the information needed to participate in the L3DAS21 Challenge, illustrating the details of the L3DAS21 dataset, the challenge tasks and the baseline models. Comment: Documentation paper for the L3DAS21 Challenge for IEEE MLSP 2021. Further information on www.l3das.com/mlsp202

    Euclid Near Infrared Spectrometer and Photometer instrument concept and first test results obtained for different breadboards models at the end of phase C

    The Euclid mission objective is to understand why the expansion of the Universe is accelerating by mapping the geometry of the dark Universe, investigating the distance-redshift relationship and tracing the evolution of cosmic structures. The Euclid project is part of ESA's Cosmic Vision program with its launch planned for 2020 (ref [1]). The NISP (Near Infrared Spectrometer and Photometer) is one of the two Euclid instruments and operates in the near-IR spectral region (900-2000 nm) as a photometer and spectrometer. The instrument is composed of three subsystems: a cold (135 K) optomechanical subsystem consisting of a silicon carbide structure, an optical assembly (corrector and camera lens), a filter wheel mechanism, a grism wheel mechanism, a calibration unit and a thermal control system; a detection subsystem, mounted on the optomechanical subsystem structure, based on a mosaic of 16 HAWAII2RG detectors cooled to 95 K with their front-end readout electronics cooled to 140 K, integrated on a mechanical focal plane structure made of molybdenum and aluminum; and a warm (280 K) electronic subsystem composed of a data processing / detector control unit and an instrument control unit that interfaces with the spacecraft via a 1553 bus for command and control and via SpaceWire links for science data. This presentation describes the architecture of the instrument at the end of phase C (Detailed Design Review), the expected performance, the key technological challenges and the preliminary test results obtained for different NISP subsystem breadboards and for the NISP Structural and Thermal Model (STM).

    Psychometric Properties and Correlates of Precarious Manhood Beliefs in 62 Nations

    Precarious manhood beliefs portray manhood, relative to womanhood, as a social status that is hard to earn, easy to lose, and proven via public action. Here, we present cross-cultural data on a brief measure of precarious manhood beliefs (the Precarious Manhood Beliefs scale [PMB]) that covaries meaningfully with other cross-culturally validated gender ideologies and with country-level indices of gender equality and human development. Using data from university samples in 62 countries across 13 world regions (N = 33,417), we demonstrate: (1) the psychometric isomorphism of the PMB (i.e., its comparability in meaning and statistical properties across the individual and country levels); (2) the PMB’s distinctness from, and associations with, ambivalent sexism and ambivalence toward men; and (3) associations of the PMB with nation-level gender equality and human development. Findings are discussed in terms of their statistical and theoretical implications for understanding widely held beliefs about the precariousness of the male gender role.

    L3DAS21 challenge: machine learning for 3D audio signal processing

    The L3DAS21 Challenge (www.l3das.com/mlsp2021) is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Usually, machine learning approaches to 3D audio tasks are based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based on multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, this is the first time that a dual-mic Ambisonics configuration has been used for these tasks. We provide baseline models and results for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and SELDnet for SELD.

    Gendered Self-Views Across 62 Countries: A Test of Competing Models

    Social role theory posits that binary gender gaps in agency and communion should be larger in less egalitarian countries, reflecting these countries’ more pronounced sex-based power divisions. Conversely, evolutionary and self-construal theorists suggest that gender gaps in agency and communion should be larger in more egalitarian countries, reflecting the greater autonomy support and flexible self-construction processes present in these countries. Using data from 62 countries (N = 28,640), we examine binary gender gaps in agentic and communal self-views as a function of country-level objective gender equality (the Global Gender Gap Index) and subjective distributions of social power (the Power Distance Index). Findings show that in more egalitarian countries, gender gaps in agency are smaller and gender gaps in communality are larger. These patterns are driven primarily by cross-country differences in men’s self-views and by the Power Distance Index (PDI) more robustly than the Global Gender Gap Index (GGGI). We consider possible causes and implications of these findings.